Generative Model Configuration
With an LLM, we can adjust generative configuration parameters at inference time to influence how the model makes its final decision about which token to generate next.
Generative configuration parameters for inference:
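max new tokens: caps the number of tokens the model generates
greedy vs. random sampling: greedy always picks the highest-probability token; random sampling draws the next token according to the probability distribution
top-k sampling vs. top-p sampling: top-k samples only from the k most probable tokens; top-p samples from the smallest set of tokens whose cumulative probability reaches p
temperature
- scales the logits going into the softmax layer of the Transformer, which changes how random the output distribution is
- lower temperature (toward 0, approaching greedy decoding) -> reliability
- higher temperature -> variety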
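The parameters above can be illustrated with a minimal sketch in plain Python. It is not how any particular library implements decoding; the function names, the toy vocabulary, and the logit values are all assumptions made up for illustration. Temperature is treated as a divisor applied to the logits before softmax, so a value of exactly 0 is rejected here and left to greedy decoding.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        # Assumption: the temperature = 0 case is handled by greedy decoding instead.
        raise ValueError("temperature must be > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(probs):
    """Greedy decoding: index of the highest-probability token."""
    return max(range(len(probs)), key=probs.__getitem__)

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    keep = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def sample(dist, rng):
    """Random sampling: draw one token index from a {index: probability} mapping."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical toy vocabulary and logits, for illustration only.
vocab = ["cake", "donut", "banana", "apple"]
logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax_with_temperature(logits, temperature=1.0)

rng = random.Random(0)  # seeded so repeated runs draw the same tokens
print("greedy:", vocab[greedy(probs)])
print("top-k (k=2):", vocab[sample(top_k_filter(probs, 2), rng)])
print("top-p (p=0.9):", vocab[sample(top_p_filter(probs, 0.9), rng)])
```

Calling softmax_with_temperature with a small value such as 0.1 pushes almost all probability onto "cake", while a large value such as 5.0 spreads it more evenly across the vocabulary. Max new tokens is not shown; in a real generation loop it would simply bound how many times a token is chosen this way.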